{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "    <img src=\"./images/dagang3.png\" width=800>\n",
    "</center>\n",
    "\n",
    "## 4.1 简介\n",
    "\n",
    "图像的实质是一种二维信号，滤波是信号处理中的一个重要概念。在图像处理中，滤波是一种非常常见的技术，它们的原理非常简单，但是其思想却十分值得借鉴，滤波是很多图像算法的前置步骤或基础，掌握图像滤波对理解卷积神经网络也有一定帮助。\n",
    "\n",
    "## 4.2 图像滤波原理\n",
    "<center>\n",
    "    <img src=\"./images/conv.png\" width=800>\n",
    "</center>\n",
    "\n",
    "\n",
    "上图蓝色点可以解释为：\n",
    "\n",
    "$$6.5 + 9.8 + 12.3 + 6.5 + 19.2 + 11.5 + 6.3 + 9.1 + 10.7=91.9 \\approx 92$$\n",
    "\n",
    "线性滤波处理的输出像素值$g(i,j)$是输入像素值$f(i+k,j+l)$的加权和 :\n",
    "\n",
    "$$g(i, j)=\\sum_{k,l}f(i+k, j+l)h(k, l)$$\n",
    "\n",
    "\n",
    "其中的加权和为 ，我们称其为“核”，滤波器的加权系数，即滤波器的“滤波系数”。\n",
    "上面的式子可以简单写作：\n",
    "\n",
    "$$g=f\\otimes h$$\n",
    "\n",
    "\n",
    "其中$f$表示输入像素值，$h$表示加权系数“核“，$g$表示输出像素值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.4.1卷积后图像大小问题\n",
    "\n",
    "$$⌊(n+2\\times p−k+s)/s⌋ = ⌊(n+2 \\times p−k)/s⌋ + 1$$\n",
    "\n",
    "其中$n$为图片长（宽），$p$为padding大小，$k$为卷积核长（宽），$s$为步长\n",
    "\n",
    "### 4.4.2填充问题\n",
    "\n",
    "在对图像应用滤波器进行过滤时，边界问题是一个需要处理的问题。一般来说，有4种处理的方法。\n",
    "\n",
    "1. 填充0\n",
    "\n",
    "对图像的边界做扩展，在扩展边界中填充0，对于边长为2k+1的方形滤波器，扩展的边界大小为k，若原来的图像为[m, n]，则扩展后图像变为[m+2k, n+2k]。\n",
    "\n",
    "2. 填充固定值\n",
    "扩展与填充0的扩展类似,只不过将填充0改为填充固定值\n",
    "\n",
    "3. 填充最近像素值\n",
    "\n",
    "扩展与填充0的扩展类似，只不过填充0的扩展是在扩展部分填充0，而这个方法是填充距离最近的像素的值。\n",
    "    \n",
    "4. 镜像填充ReflectionPad2d\n",
    "\n",
    "根据边缘进行镜像对称填充"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "ename": "ModuleNotFoundError",
     "evalue": "No module named 'torch'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mModuleNotFoundError\u001b[0m                       Traceback (most recent call last)",
      "Input \u001b[1;32mIn [1]\u001b[0m, in \u001b[0;36m<cell line: 1>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mtorch\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m nn\n\u001b[0;32m      2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mtorch\u001b[39;00m\n\u001b[0;32m      4\u001b[0m x \u001b[38;5;241m=\u001b[39m torch\u001b[38;5;241m.\u001b[39marange(\u001b[38;5;241m9\u001b[39m, dtype\u001b[38;5;241m=\u001b[39mtorch\u001b[38;5;241m.\u001b[39mfloat)\u001b[38;5;241m.\u001b[39mreshape(\u001b[38;5;241m1\u001b[39m, \u001b[38;5;241m1\u001b[39m, \u001b[38;5;241m3\u001b[39m, \u001b[38;5;241m3\u001b[39m)\n",
      "\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'torch'"
     ]
    }
   ],
   "source": [
    "from torch import nn\n",
    "import torch\n",
    "\n",
    "x = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)\n",
    "p = nn.ReflectionPad2d(2)\n",
    "p(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.3 线性滤波\n",
    "\n",
    "### 4.3.1 均值滤波\n",
    "\n",
    "**均值滤波的应用场合：**\n",
    "根据冈萨雷斯书中的描述，均值模糊可以模糊图像以便得到感兴趣物体的粗略描述，也就是说，去除图像中的不相关细节，其中“不相关”是指与滤波器模板尺寸相比较小的像素区域，从而对图像有一个整体的认知。即为了对感兴趣的物体得到一个大致的整体的描述而模糊一幅图像，忽略细小的细节。\n",
    "\n",
    "**均值滤波的缺陷：**\n",
    "均值滤波本身存在着固有的缺陷，即它不能很好地保护图像细节，在图像去噪的同时也破坏了图像的细节部分，从而使图像变得模糊，不能很好地去除噪声点，特别是椒盐噪声。\n",
    "\n",
    "均值滤波方法是：对待处理的当前像素，选择一个模板，该模板为其邻近的若干个像素组成，用模板的均值（方框滤波归一化）来替代原像素的值。公式表示为：\n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/2020041213054619.png\"/></div>           \n",
    "\n",
    "g(x,y)为该邻域的中心像素，n跟系数模版大小有关，一般3*3邻域的模板，n取为9，如：\n",
    "\n",
    "$$\\left(\\begin{matrix}\n",
    "\\frac{1}{9} & \\frac{1}{9} & \\frac{1}{9} \\\\\n",
    "\\frac{1}{9} & \\frac{1}{9} & \\frac{1}{9} \\\\\n",
    "\\frac{1}{9} & \\frac{1}{9} & \\frac{1}{9}\n",
    "\\end{matrix}\\right)$$       \n",
    "\n",
    "当然，模板是可变的，一般取奇数，如5 * 5 , 7 * 7等等。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "ename": "ModuleNotFoundError",
     "evalue": "No module named 'tensorbay'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mModuleNotFoundError\u001b[0m                       Traceback (most recent call last)",
      "Input \u001b[1;32mIn [2]\u001b[0m, in \u001b[0;36m<cell line: 3>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mnumpy\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mnp\u001b[39;00m\n\u001b[0;32m      2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mcv2\u001b[39;00m\n\u001b[1;32m----> 3\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mtensorbay\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m GAS\n\u001b[0;32m      4\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mtensorbay\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mdataset\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m Segment\n\u001b[0;32m      6\u001b[0m \u001b[38;5;66;03m# Authorize a GAS client.\u001b[39;00m\n",
      "\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'tensorbay'"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import cv2\n",
    "from tensorbay import GAS\n",
    "from tensorbay.dataset import Segment\n",
    "\n",
    "# Authorize a GAS client.\n",
    "gas = GAS('Accesskey-dac95e0d8e685ef4d5f8b80d51e38499')\n",
    "\n",
    "# Get a dataset client.\n",
    "\n",
    "dataset_client = gas.get_dataset(\"DogsVsCats-2\")\n",
    "\n",
    "# List dataset segments.\n",
    "segments = dataset_client.list_segment_names()\n",
    "\n",
    "# Get a segment by name\n",
    "segment = Segment(\"train\", dataset_client)\n",
    "fp = segment[1].open()\n",
    "with open(\"a.png\", \"wb\") as f:\n",
    "    f.write(fp.read()) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def Conv(img, kernel, stride=1, padding=0):\n",
    "    '''\n",
    "    img: 输入图片\n",
    "    kernel: 卷积核\n",
    "    stride: 步长\n",
    "    padding: 填充图片，这里用零来填充\n",
    "    '''\n",
    "    h, w = img.shape[0], img.shape[1]\n",
    "    img = img.reshape(h, w, -1)\n",
    "    c = img.shape[2]\n",
    "    if padding > 0:\n",
    "        img_p = np.zeros((h+2*padding, w+2*padding, c))\n",
    "        img_p[padding:-padding, padding:-padding, :] = img\n",
    "    else:\n",
    "        img_p = img\n",
    "    h_k, w_k = kernel.shape\n",
    "    h, w = img_p.shape[0], img_p.shape[1]\n",
    "    h_dst, w_dst = int((h - h_k + stride)/stride), int((w - w_k + stride)/stride)\n",
    "    img_dest = np.zeros((h_dst, w_dst, c))\n",
    "    for i in range(h_dst):\n",
    "        for j in range(w_dst):\n",
    "            for k in range(c):\n",
    "                img_dest[i, j, k] = np.sum(img_p[i*stride:i*stride+h_k, j*stride:j*stride+w_k, k] * kernel)\n",
    "    return img_dest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "def noise(img, snr):\n",
    "    h = img.shape[0]\n",
    "    w = img.shape[1]\n",
    "    img1 = img.copy()\n",
    "    num = int(h * w * (1 - snr))\n",
    "    for i in range (num):\n",
    "        randx = np.random.randint(0, h)\n",
    "        randy = np.random.randint(0, w)\n",
    "        if np.random.random()<=0.5:\n",
    "            img1[randx,randy]=0\n",
    "        else:\n",
    "            img1[randx,randy]=255\n",
    "    return img1\n",
    "\n",
    "img = cv2.imread('a.png', 1)\n",
    "img_n = noise(img, 0.9)\n",
    "cv2.imshow('n', img_n)\n",
    "kernel = (1/100) * np.ones((10, 10))\n",
    "img_blur = Conv(img_n, kernel).astype(np.uint8)\n",
    "cv2.imshow('b', img_blur)\n",
    "key = cv2.waitKey()\n",
    "if key == 27:\n",
    "    cv2.destroyAllWindows()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.3.2 高斯滤波\n",
    "\n",
    "**应用：** 高斯滤波是一种线性平滑滤波器，对于服从正态分布的噪声有很好的抑制作用。在实际场景中，我们通常会假定图像包含的噪声为高斯白噪声，所以在许多实际应用的预处理部分，都会采用高斯滤波抑制噪声，如传统车牌识别等。\n",
    "\n",
    "高斯滤波和均值滤波一样，都是利用一个掩膜和图像进行卷积求解。不同之处在于：均值滤波器的模板系数都是相同的，而高斯滤波器的模板系数，则随着距离模板中心的增大而系数减小（服从二维高斯分布）。所以，高斯滤波器相比于均值滤波器对图像个模糊程度较小，更能够保持图像的整体细节。\n",
    "\n",
    "二维高斯分布\n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200412165241878.png\"/></div>     \n",
    "\n",
    "其中不必纠结于系数，因为它只是一个常数！并不会影响互相之间的比例关系，并且最终都要进行归一化，所以在实际计算时我们是忽略它而只计算后半部分:\n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200412165426976.png\"/></div>     \n",
    "\n",
    "其中(x,y)为掩膜内任一点的坐标，(ux,uy)为掩膜内中心点的坐标，在图像处理中可认为是整数；σ是标准差。\n",
    "\n",
    "例如：要产生一个3×3的高斯滤波器模板，以模板的中心位置为坐标原点进行取样。模板在各个位置的坐标，如下所示（x轴水平向右，y轴竖直向下）。  \n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200412165500464.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80MDY0NzgxOQ==,size_16,color_FFFFFF,t_70\"/></div>     \n",
    "\n",
    "这样，将各个位置的坐标带入到高斯函数中，得到的值就是模板的系数。\n",
    "\n",
    "**生成高斯掩膜（小数形式）**  \n",
    "知道了高斯分布原理，实现起来也就不困难了。\n",
    "\n",
    "首先我们要确定我们生产掩模的尺寸wsize，然后设定高斯分布的标准差。生成的过程，我们首先根据模板的大小，找到模板的中心位置center。 然后就是遍历，根据高斯分布的函数，计算模板中每个系数的值。\n",
    "\n",
    "最后模板的每个系数要除以所有系数的和。这样就得到了小数形式的模板。由于中心点为（0, 0）点，故公式变为：\n",
    "\n",
    "$$f(x, y) = e^{-\\frac{x^2 + y^2}{2\\sigma ^2}}$$\n",
    "\n",
    "\n",
    "**3×3,σ=0.8的小数型模板：**   \n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200412165756162.png\"/></div>     \n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def imgGaussian(ksize, sigma):\n",
    "    '''\n",
    "    :ksize：卷积核大小，必须是奇数\n",
    "    :param sigma: σ标准差\n",
    "    :return: 高斯滤波器的模板\n",
    "    '''\n",
    "    assert ksize%2 == 1\n",
    "    gaussian_mat = np.zeros((ksize, ksize))\n",
    "    k = int(ksize/2)\n",
    "    for x in range(-k, k+1):\n",
    "        for y in range(-k, k + 1):\n",
    "            gaussian_mat[x + k][y + k] = np.exp(- (x ** 2 + y ** 2) / (2 * sigma ** 2))\n",
    "    return gaussian_mat / gaussian_mat.sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**σ的意义及选取**   \n",
    "通过上述的实现过程，不难发现，高斯滤波器模板的生成最重要的参数就是高斯分布的标准差σ。标准差代表着数据的离散程度，如果σ较小，那么生成的模板的中心系数较大，而周围的系数较小，这样对图像的平滑效果就不是很明显；反之，σ较大，则生成的模板的各个系数相差就不是很大，比较类似均值模板，对图像的平滑效果比较明显。\n",
    "\n",
    "来看下一维高斯分布的概率分布密度图：\n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200412165824372.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80MDY0NzgxOQ==,size_16,color_FFFFFF,t_70\"/></div>     \n",
    "\n",
    "\n",
    "于是我们有如下结论：σ越小分布越瘦高，σ越大分布越矮胖。\n",
    "\n",
    "* σ越大，分布越分散，各部分比重差别不大，于是生成的模板各元素值差别不大，类似于平均模板；\n",
    "* σ越小，分布越集中，中间部分所占比重远远高于其他部分，反映到高斯模板上就是中心元素值远远大于其他元素值，于是自然而然就相当于中间值得点运算。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "kernel = imgGaussian(5, 7)\n",
    "img_b = Conv(img_n, kernel, padding=0).astype(np.uint8)  # 高斯模糊\n",
    "cv2.imshow('b', img_b)\n",
    "key = cv2.waitKey()\n",
    "if key == 27:\n",
    "    cv2.destroyAllWindows()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.4 非线性滤波\n",
    "\n",
    "### 4.4.1 中值滤波\n",
    "\n",
    "中值滤波是一种典型的非线性滤波，是基于排序统计理论的一种能够有效抑制噪声的非线性信号处理技术，基本思想是用像素点邻域灰度值的中值来代替该像素点的灰度值，让周围的像素值接近真实的值从而消除孤立的噪声点。该方法在取出脉冲噪声、椒盐噪声的同时能保留图像的边缘细节。这些优良特性是线性滤波所不具备的。\n",
    "\n",
    "中值滤波首先也得生成一个滤波模板，将该模板内的各像素值进行排序，生成单调上升或单调下降的二维数据序列，二维中值滤波输出为\n",
    "\n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/2020042802014220.png\"/></div>    \n",
    "\n",
    "<div align=center><img   src =\"https://img-blog.csdnimg.cn/20200428021512767.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzQ0MzE1OTg3,size_16,color_FFFFFF,t_70\"/></div>  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "img_b = cv2.medianBlur(img_n, 9)\n",
    "cv2.imshow('b', img_b);\n",
    "key = cv2.waitKey()\n",
    "if key == 27:\n",
    "    cv2.destroyAllWindows()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4.4.2 双边滤波\n",
    "\n",
    "\n",
    "双边滤波器的好处是可以做边缘保存（edge preserving），一般过去用的维纳滤波或者高斯滤波去降噪，都会较明显地模糊边缘，对于高频细节的保护效果并不明显。双边滤波器顾名思义比高斯滤波多了一个高斯方差sigma－d，它是基于空间分布的高斯滤波函数，所以在边缘附近，离的较远的像素不会太多影响到边缘上的像素值，这样就保证了边缘附近像素值的保存。\n",
    "\n",
    "在双边滤波器中，输出像素的值依赖于邻域像素值的加权值组合：\n",
    "\n",
    "<div align=center><img   src =\"../images/Bilateral1.png\" width=300/></div>    \n",
    "\n",
    "而加权系数w(i,j,k,l)取决于定义域核和值域核的乘积。\n",
    "\n",
    "其中定义域核表示如下（如图）：\n",
    "\n",
    "<div align=center><img   src =\"../images/bilateral2.png\" width=300/></div>    \n",
    "\n",
    "值域核表示为：\n",
    "\n",
    "<div align=center><img   src =\"../images/bilateral3.png\" width=300/></div>    \n",
    "\n",
    "两者相乘后，就会产生依赖于数据的双边滤波权重函数：\n",
    "\n",
    "<div align=center><img   src =\"../images/bilateral4.png\" width=300/></div>    \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "img_b = cv2.bilateralFilter(img, 9, 75, 75)\n",
    "cv2.imshow('b', img_b)\n",
    "key = cv2.waitKey()\n",
    "if key == 27:\n",
    "    cv2.destroyAllWindows()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.21908177282153884"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "a = np.sqrt(11.39/6.77)\n",
    "\n",
    "0.803/a - 0.4\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-0.04429999999999998"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(-0.2473+0.203) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2742121789286682"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(-0.2473+0.203) /(np.sqrt(1.1) - np.sqrt(0.6))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.23042121789286685"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.1*(np.sqrt(1.1) - np.sqrt(0.6))+0.203"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "参考资料：https://github.com/datawhalechina/team-learning-cv/blob/master/ImageProcessingFundamentals/04%20%E5%9B%BE%E5%83%8F%E6%BB%A4%E6%B3%A2.md"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
